Please submit your answers on Gradescope as a PDF with pages matched to question answers.
One way to prepare your solutions to this homework is with R
Markdown, which provides a way to include mathematical notation, text,
code, and figures in a single document. A template .Rmd
file is available through D2L.
Make sure all solutions are clearly labeled, and please utilize the question pairing tool on Gradescope. You are encouraged to work together, but your solutions, code, plots, and wording should always be your own. Come and see me during office hours or schedule an appointment when you get stuck and can’t get unstuck.
[12 pts] The goal of this question is to show that if a factorial design is \(D\)-optimal for the “effects” coding, it is also optimal for the 0-1 coding.
Consider an experiment designed to fit an additive model for two factors, A and B, each with two levels.
The matrix \(A\) satisfying \(X^{eff}A = X\) is shown at the bottom of the following R chunk.
library(MASS)     # ginv()
library(dplyr)
library(lmerTest)
levels <- factor(c("-","+"))
df <- expand.grid(A = levels,
                  B = levels,
                  C = levels)
formula <- ~ A + B + C
# 0-1 (dummy) coding
X_dm <- model.matrix(formula, data = df)
# effects coding via Helmert contrasts (+/-1 coding for 2-level factors)
X_eff <- model.matrix(formula, data = df,
                      contrasts.arg = list(A = contr.helmert,
                                           B = contr.helmert,
                                           C = contr.helmert))
# A solves X_eff %*% A = X_dm (pseudoinverse solution)
A <- round(ginv(X_eff) %*% X_dm, 2)
X_eff %*% A # correct output: reproduces the 0-1 model matrix X_dm
## (Intercept) A+ B+ C+
## 1 1 0 0 0
## 2 1 1 0 0
## 3 1 0 1 0
## 4 1 1 1 0
## 5 1 0 0 1
## 6 1 1 0 1
## 7 1 0 1 1
## 8 1 1 1 1
A
## (Intercept) A+ B+ C+
## [1,] 1 0.5 0.5 0.5
## [2,] 0 0.5 0.0 0.0
## [3,] 0 0.0 0.5 0.0
## [4,] 0 0.0 0.0 0.5
\(X^{eff}A=X\) holds because the Helmert contrasts turn \(X\) into a model matrix \(X^{eff}\) with mutually orthogonal columns; the matrix \(A\) solved for in part (a) is the change-of-coding matrix that carries \(X^{eff}\) back to the 0-1 matrix \(X\). The same construction generalizes to other codings of the model matrix (beyond effects and dummy coding), since any two full-rank codings are related by such a nonsingular transformation.
You can also see that \(|X'X|\) and \(|X'_{eff}X_{eff}|\) differ only by a positive constant factor, as the derivation below shows. Since \(A\) is nonsingular, \(AA'\) is positive definite, so \(|AA'| > 0\). Therefore a design that minimizes \(|X'_{eff}X_{eff}|^{-1/2}\) also minimizes \(|X'X|^{-1/2}\).
\[ \begin{align*} \text{minimize } |X'X|^{-1/2} &\Leftrightarrow \text{ maximize } |X'X| \\ |X'X| &= |(A'X'_{\text{eff}})(X_{\text{eff}}A)| \\ &= |A'(X'_{\text{eff}}X_{\text{eff}})A| \\ &= |A||A'||X'_{\text{eff}}X_{\text{eff}}| \\ &= |AA'||X'_{\text{eff}}X_{\text{eff}}| \\ \end{align*} \]
Now consider an arbitrary experiment with \(k\) factors, each with its own number of levels, designed for any linear model with identifiable parameters.
The D-criterion is proportional to \(|X'X|^{-1/2}\), which in turn is proportional to the volume of the joint confidence ellipsoid for the coefficient estimates. By the argument in part (b), the effects and dummy codings give criterion values that differ only by the positive constant \(|AA'|\), so the two codings rank all candidate designs identically. The argument generalizes: any other full-rank encoding is related to the effects coding by some nonsingular matrix \(A\), \(AA'\) is positive definite with positive determinant, and the criterion again changes only by a constant factor. Therefore the choice of encoding does not change the D-optimal design.
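The determinant identity from part (b) can also be checked numerically for the \(2^3\) design above; a sketch that rebuilds the matrices so the chunk stands alone:

```r
library(MASS)  # ginv()

lv <- factor(c("-","+"))
df <- expand.grid(A = lv, B = lv, C = lv)
X_dm  <- model.matrix(~ A + B + C, data = df)                 # 0-1 coding
X_eff <- model.matrix(~ A + B + C, data = df,                 # effects coding
                      contrasts.arg = list(A = contr.helmert,
                                           B = contr.helmert,
                                           C = contr.helmert))
A <- ginv(X_eff) %*% X_dm   # unrounded change-of-coding matrix

# |X'X| should equal |AA'| * |X_eff' X_eff|
lhs <- det(t(X_dm) %*% X_dm)
rhs <- det(A %*% t(A)) * det(t(X_eff) %*% X_eff)
all.equal(lhs, rhs)  # TRUE up to floating-point tolerance
```

The check relies only on `X_dm = X_eff %*% A`, so it works the same way for any other pair of full-rank codings.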
vibration <- read.csv("./Datasets/Q6-5.csv")
vibration <- vibration %>%
  mutate_at(vars(-Vibration), as.factor)
vibration_aov <- aov(Vibration ~ Speed * Size, data = vibration)
summary(vibration_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## Speed 1 227.3 227.3 38.02 4.83e-05 ***
## Size 1 1107.2 1107.2 185.25 1.17e-08 ***
## Speed:Size 1 303.6 303.6 50.80 1.20e-05 ***
## Residuals 12 71.7 6.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(vibration_aov, which = c(2,1))
plot(vibration$Size, resid(vibration_aov), main = "Size v. Residuals", xlab="Size", ylab="Resid")
plot(vibration$Speed, resid(vibration_aov), main = "Speed v. Residuals", xlab="Speed", ylab="Resid")
interaction.plot(vibration$Size, vibration$Speed, vibration$Vibration, type = "b",
                 pch = 1:6, col = RColorBrewer::brewer.pal(6, "Dark2"))
thickness <- read.csv("./Datasets/Q6-12.csv")
thickness <- thickness %>%
  mutate_at(vars(-Thickness), as.factor)
# full data
thickness_aov <- aov(Thickness ~ Rate * Time, data = thickness,
                     contrasts = list(Rate = "contr.helmert",
                                      Time = "contr.helmert"))
coef(thickness_aov) * 2 # effect estimates: 2 x the Helmert coefficients
## (Intercept) Rate1 Time1 Rate1:Time1
## 29.02775 -0.31725 0.58600 0.28150
# removing outlier
thickness_aov_rm <- aov(Thickness ~ Rate * Time, data = thickness[c(-2, -15),],
                        contrasts = list(Rate = "contr.helmert",
                                         Time = "contr.helmert"))
coef(thickness_aov_rm) * 2
## (Intercept) Rate1 Time1 Rate1:Time1
## 28.81595833 0.01920833 0.92245833 0.06970833
summary(thickness_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## Rate 1 0.403 0.4026 1.262 0.2833
## Time 1 1.374 1.3736 4.305 0.0602 .
## Rate:Time 1 0.317 0.3170 0.994 0.3386
## Residuals 12 3.828 0.3190
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(thickness_aov_rm)
## Df Sum Sq Mean Sq F value Pr(>F)
## Rate 1 0.0444 0.0444 12.657 0.0052 **
## Time 1 2.9175 2.9175 832.554 5.82e-11 ***
## Rate:Time 1 0.0167 0.0167 4.754 0.0542 .
## Residuals 10 0.0350 0.0035
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coef(thickness_aov_rm)
## (Intercept) Rate1 Time1 Rate1:Time1
## 14.407979167 0.009604167 0.461229167 0.034854167
plot(thickness_aov, which = c(1,2))
plot(thickness$Rate, resid(thickness_aov), main = "Residuals v. Rate", xlab = "Rate", ylab = "Resid")
plot(thickness$Time, resid(thickness_aov), main = "Residuals v. Time", xlab = "Time", ylab = "Resid")
plot(thickness_aov_rm, which = c(1,2))
plot(thickness$Rate[c(-2,-15)], resid(thickness_aov_rm), main = "Residuals v. Rate (Outlier Removed)", xlab = "Rate", ylab = "Resid")
plot(thickness$Time[c(-2,-15)], resid(thickness_aov_rm), main = "Residuals v. Time (Outlier Removed)", xlab = "Time", ylab = "Resid")
# log transformation
plot(aov(log(Thickness) ~ Rate*Time, data = thickness), 1:2)
# box-cox
bc <- boxcox(thickness_aov)
bc$x[which.max(bc$y)]
## [1] -2
plot(aov((Thickness)^-2 ~ Rate*Time, data = thickness), 1:2)
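The inverse-square response used above is the member of the Box-Cox family selected by the profile likelihood (\(\hat\lambda = -2\)). For reference, assuming the standard definition of the family:

```latex
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[4pt]
  \log y, & \lambda = 0.
\end{cases}
```

Since \(y^{(-2)}\) is an affine function of \(y^{-2}\), fitting the ANOVA to \(\text{Thickness}^{-2}\) is equivalent to using \(y^{(-2)}\) up to a rescaling of the coefficients.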
(4) [8 pts] (DAE 6.28)
yield <- read.csv("./Datasets/Q6-28.csv")
yield <- yield %>%
  mutate_at(vars(-Yield), as.factor)
yield_aov <- aov(Yield ~ A*B*C*D, data = yield,
                 contrasts = list(A = "contr.helmert",
                                  B = "contr.helmert",
                                  C = "contr.helmert",
                                  D = "contr.helmert"))
anova(yield_aov)
## Warning in anova.lm(yield_aov): ANOVA F-tests on an essentially perfect fit are
## unreliable
coefs <- coef(yield_aov)
# Normal QQ plot of the effect estimates (2 x the Helmert coefficients)
qqnorm(coefs[-1] * 2, datax = TRUE)
sort(coefs[-1]) # B has the smallest main effect, so drop B from the model
## A1:C1 B1:C1:D1 A1:B1 A1:C1:D1 B1:D1
## -2.125000e+00 -3.750000e-01 -3.750000e-01 -1.250000e-01 -6.595890e-16
## C1:D1 B1:C1 B1 A1:B1:D1 A1:B1:C1
## -2.984437e-16 1.250000e-01 2.500000e-01 3.750000e-01 5.000000e-01
## A1:B1:C1:D1 C1 D1 A1:D1 A1
## 5.000000e-01 1.000000e+00 1.625000e+00 2.000000e+00 2.250000e+00
yield_aov_ACD <- aov(Yield ~ A*C*D, data = yield,
                     contrasts = list(A = "contr.helmert",
                                      C = "contr.helmert",
                                      D = "contr.helmert"))
anova(yield_aov_ACD)
coef(yield_aov_ACD)
## (Intercept) A1 C1 D1 A1:C1
## 1.737500e+01 2.250000e+00 1.000000e+00 1.625000e+00 -2.125000e+00
## A1:D1 C1:D1 A1:C1:D1
## 2.000000e+00 1.552447e-16 -1.250000e-01
plot(yield_aov_ACD, 1:2)
plot(yield$A, resid(yield_aov_ACD), main = "Residuals v. A", xlab="A", ylab="Resid")
plot(yield$C, resid(yield_aov_ACD), main = "Residuals v. C", xlab="C", ylab="Resid")
plot(yield$D, resid(yield_aov_ACD), main = "Residuals v. D", xlab="D", ylab="Resid")
perform_mean_range <- function(x) return(c(mean(x), diff(range(x)))) # mean and range (max - min)
perform_mean_range(yield[yield$A==1 & yield$B==1 & yield$C==1,"Yield"])
## [1] 19 8
perform_mean_range(yield[yield$A==1 & yield$B==1 & yield$C==-1,"Yield"])
## [1] 20 8
perform_mean_range(yield[yield$A==1 & yield$B==-1 & yield$C==1,"Yield"])
## [1] 18 6
perform_mean_range(yield[yield$A==1 & yield$B==-1 & yield$C==-1,"Yield"])
## [1] 21.5 7.0
perform_mean_range(yield[yield$A==-1 & yield$B==1 & yield$C==1,"Yield"])
## [1] 18.5 3.0
perform_mean_range(yield[yield$A==-1 & yield$B==1 & yield$C==-1,"Yield"])
## [1] 13 0
perform_mean_range(yield[yield$A==-1 & yield$B==-1 & yield$C==1,"Yield"])
## [1] 18 2
perform_mean_range(yield[yield$A==-1 & yield$B==-1 & yield$C==-1,"Yield"])
## [1] 11 2
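The eight repeated subset calls above can be collapsed into a single summary table with base R's `aggregate()`; a sketch, assuming the same `yield` data frame is in scope (`cell_stats` is a hypothetical helper name):

```r
# One row per A/B/C combination: mean and range (max - min) of the two runs.
cell_stats <- function(x) c(mean = mean(x), range = diff(range(x)))
yield_summary <- aggregate(Yield ~ A + B + C, data = yield, FUN = cell_stats)
yield_summary
```

The formula method groups by every level combination of A, B, and C, so the output has one row per cell with a two-column `Yield` matrix (`mean`, `range`).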
brownies <- read.csv("./Datasets/brownies.csv")
brownies <- brownies %>%
  mutate_at(vars(-c(X,score)), as.factor)
brownies_aov <- aov(score~ A + B + C, data = brownies)
summary(brownies_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 72.3 72.25 12.699 0.000725 ***
## B 1 18.1 18.06 3.175 0.079849 .
## C 1 0.1 0.06 0.011 0.916877
## Residuals 60 341.4 5.69
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coef(brownies_aov)
## (Intercept) A+ B+ C+
## 10.0000 2.1250 1.0625 -0.0625
plot(brownies_aov, 1:2)
plot(brownies$A, resid(brownies_aov), main = "Residuals v. A", xlab="A", ylab="Resid")
plot(brownies$B, resid(brownies_aov), main = "Residuals v. B", xlab="B", ylab="Resid")
plot(brownies$C, resid(brownies_aov), main = "Residuals v. C", xlab="C", ylab="Resid")
[2 pts] The analysis in part (a) is not the correct approach because there is only one batch per factor combination. The eight panelists scoring a batch are subsamples of that single batch, not true replicates of the factor combination. The test panel should instead be treated as a random effect and nuisance factor in the design.
[2 pts] The analyses of the batch means and standard deviations show that none of the factors are significant. This approach is more appropriate because it treats each batch as a single experimental unit with an average rating and a standard deviation. The standard-deviation analysis also gives some evidence that a random effect for Panel_ID would not be significant if it were included. So while the mean/standard-deviation analysis is the more appropriate one, the analysis in part (a) could serve as a follow-up test: it has more power, and the individual scores could arguably be treated as replicates since the panelists do not differ significantly in their scoring.
head(brownies)
brownies_summary <- brownies[1:8, c("A","B","C")]
brownies_summary$mean <- tapply(brownies$score, brownies$Panel_ID, mean)
brownies_summary$sd <- tapply(brownies$score, brownies$Panel_ID, sd)
brownies_mean_aov <- aov(mean ~ A + B + C, data = brownies_summary)
brownies_sd_aov <- aov(sd ~ A + B + C, data = brownies_summary)
summary(brownies_mean_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 0.070 0.070 0.072 0.8023
## B 1 1.125 1.125 1.145 0.3448
## C 1 4.500 4.500 4.581 0.0991 .
## Residuals 4 3.930 0.982
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(brownies_sd_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 0.4684 0.4684 2.216 0.211
## B 1 0.0001 0.0001 0.000 0.984
## C 1 0.2316 0.2316 1.096 0.354
## Residuals 4 0.8456 0.2114
The diagnostic plots below show approximately normal residuals and roughly constant variance.
plot(brownies_mean_aov, 1:2)
plot(brownies_summary$A, resid(brownies_mean_aov), main = "Residuals v. A", xlab="A", ylab="Resid")
plot(brownies_summary$B, resid(brownies_mean_aov), main = "Residuals v. B", xlab="B", ylab="Resid")
plot(brownies_summary$C, resid(brownies_mean_aov), main = "Residuals v. C", xlab="C", ylab="Resid")
plot(brownies_sd_aov, 1:2)
plot(brownies_summary$A, resid(brownies_sd_aov), main = "Residuals v. A", xlab="A", ylab="Resid")
plot(brownies_summary$B, resid(brownies_sd_aov), main = "Residuals v. B", xlab="B", ylab="Resid")
plot(brownies_summary$C, resid(brownies_sd_aov), main = "Residuals v. C", xlab="C", ylab="Resid")
brownies_random_aov <- lmer(score ~ A + B + C + (1 | Panel_ID), data = brownies)
anova(brownies_random_aov)
ranova(brownies_random_aov)
coef(brownies_random_aov)
## $Panel_ID
## (Intercept) A+ B+ C+
## 1 10.375737 2.125 1.0625 -0.0625
## 2 10.239105 2.125 1.0625 -0.0625
## 3 11.127211 2.125 1.0625 -0.0625
## 4 9.897526 2.125 1.0625 -0.0625
## 5 9.214368 2.125 1.0625 -0.0625
## 6 9.351000 2.125 1.0625 -0.0625
## 7 9.487631 2.125 1.0625 -0.0625
## 8 10.307421 2.125 1.0625 -0.0625
##
## attr(,"class")
## [1] "coef.mer"